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Abstract 

We present wannier90, a program for calculating maximally-localised Wannier func- 
tions (MLWF) from a set of Bloch energy bands that may or may not be attached 
to or mixed with other bands. The formalism works by minimising the total spread 
of the MLWF in real space. This done in the space of unitary matrices that describe 
rotations of the Bloch bands at each k-point. As a result, wannier90 is independent 
of the basis set used in the underlying calculation to obtain the Bloch states. There- 
fore, it may be interfaced straightforwardly to any electronic structure code. The 
locality of MLWF can be exploited to compute band-structure, density of states and 
Fermi surfaces at modest computational cost. Furthermore, wannier90 is able to 
output MLWF for visualisation and other post-processing purposes. Wannier func- 
tions are already used in a wide variety of applications. These include analysis of 
chemical bonding in real space; calculation of dielectric properties via the modern 
theory of polarisation; and as an accurate and minimal basis set in the construc- 
tion of model Hamiltonians for large-scale systems, in linear-scaling quantum Monte 
Carlo calculations, and for efficient computation of material properties, such as the 
anomalous Hall coefficient. wannier90 is freely available under the GNU General 
Public License from http://www.wannier.org/. 
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PROGRAM SUMMARY 



Title of program: wannier90 
Catalogue identifier: 
Program Summary URL: 

Program obtainable from: CPC Program Library, Queen's University of Belfast, 
N. Ireland, or from the web-page |http: //www.wannier.org/| 

Licensing provisions: This program is distributed under then GNU General 
Public License v2.0 (see http://www.gnu.org/ for details) 

Computers for which the program has been designed and others on which it 
has been operable: any architecture with a Fortran 90 compiler 

Operating systems under which the program has been tested: Linux, Windows, 
Solaris, AIX, Tru64 Unix, OSX 

Programming Languages used: Fortran 90, perl 

Libraries required: 

• BLAS (Ihttp: / /www/netlib.org/blas ) 



• LAPACK (http://www.netlib.org/lapack) 
Both available under open-source licenses. 
Memory used to execute with typical data: 10 Mb 
CPU time required to execute test cases: 1 min 
No. of bits in a word: 32 or 64 
Has the code been vectorised or parallelised?: No 

Number of bytes in distributed program including test data, etc. : 4 600 000 
Distribution format: gzipped tar 

Keywords: Electronic structure, density-functional theory, Wannier functions 

Nature of physical problem: Obtaining maximally-localised Wannier functions 
from a set of Bloch energy bands that may or may not be entangled. 

Method of solution: In the case of entangled bands, the optimally-connected 



2 



subspace of interest is determined by minimising a functional which measures 
the subspace dispersion across the Brillouin zone. The maximally-localised 
Wannier functions within this subspace are obtained by subsequent minimisa- 
tion of a functional that represents the total spread of the Wannier functions 
in real space. For the case of isolated energy bands only the second step of the 
procedure is required. 

Unusual features of the program: Simple and user-friendly input system. Wan- 
nier functions and interpolated band structure output in a variety of file for- 
mats for visualisation. 

References: 

(a) N. Marzari and D. Vanderbilt, "Maximally localized generalized Wannier 
functions for composite energy bands", Phys. Rev. B 56, 12847 (1997) 

(b) I. Souza, N. Marzari and D. Vanderbilt, "Maximally localized Wannier 
functions for entangled energy bands", Phys. Rev. B 65, 035109 (2001) 



LONG WRITE-UP 
1 Introduction 

Within the independent-particle approximation, the electronic ground state 
of a periodic system may be solved in terms of a set of extended Bloch states 
-i/> n k(r). These states are characterised by the good quantum numbers n and k, 
which refer to the band index and crystal momentum, respectively. Although 
this choice is widely used in electronic structure calculations, alternative repre- 
sentations are available. For example, the Wannier representation constitutes 
a description in terms of localised functions labeled by R, the lattice vector 
of the cell in which the function is localised, and a band-like index n. 

Wannier functions give a real-space picture of the electronic structure of a 
system. They provide insight into the nature of the chemical bonding and can 
be a powerful tool in the study of dielectric properties via the modern theory 
of polarisation. 

The phase indeterminacy e 1 ^^ of an isolated Bloch state ip n k(r) at each wave- 
vector k results in the Wannier functions being non-unique. For a group of 
isolated bands, such as the valence states of an insulator, this indeterminacy 
is more general as they may undergo arbitrary unitary transformations 
amongst themselves at each k. Two of the authors (NM and DV) developed a 
procedure that iteratively refines these degrees of freedom such that they lead 
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to Wannier functions that are well-defined and localised (in the sense that 
they minimise the second moment around their centres) [1]. We refer to this 
procedure as the Marzari-Vanderbilt (MV) scheme and the resulting Wannier 
functions as maximally- localised Wannier functions (MLWF) . 

The MV method was designed to be applicable to isolated sub-groups of bands, 
i.e., a group of bands that, though they may have degeneracies with each other 
at certain points, are separated from all other bands by finite gaps throughout 
the Brillouin zone. An important example is the set of valence bands of an 
insulator. In many applications, however, the bands of interest are not isolated 
and one is interested, for instance, in the partially filled bands of a metal close 
to the Fermi level. In this case the bands of interest lie within a limited energy 
range but are attached to, or cross with, other bands that extend further out 
in energy. Such bands are referred to as entangled bands. 

The complication associated with treating entangled bands is that the states 
of interest hybridise with unwanted bands. As a result, the number of bands at 
each k-point in the relevant energy range may be greater than the number of 
Wannier functions N that are required. Before the MV localisation procedure 
may be applied, a prescription for extracting iV bands at each k-point from 
the entangled manifold of states is required. Three of the authors developed 
just such a disentanglement procedure, allowing MLWF to be determined from 
a set of entangled bands [2]. This Souza-Marzari-Vanderbilt (SMV) strategy 
breaks down the procedure into two steps: first, the correct iV-dimensional 
subspace of bands at every k is selected; and second, N MLWF are localised 
from this subspace, exactly in the same way as for an isolated set of bands. 

wannier90 is a tool for obtaining MLWF from a set of (possibly entangled) 
energy bands using the methods of MV and SMV. The principal ingredient 
that is required from an electronic structure calculation is the overlap matrix 
between the periodic part |w n k) of Bloch states at neighbouring k-points (de- 
scribed in more detail in Section 2). This matrix is small and independent of 
the basis used in the underlying calculation to obtain the Bloch states. As a 
result wannier90 may be interfaced to any electronic structure code. At the 
time of writing, wannier90 is able to: 

1. Disentangle an optimally-connected iV-dimensional subspace of bands at 
each k-point and obtain the unitary transformations that generate MLWF 
from a given set of bands; 

2. Output MLWF in a number of formats suitable for visualisation; 

3. Generate interpolated band structures, densities of states and Fermi sur- 
faces and output them in formats suitable for visualisation; 

4. Output matrix elements of the Hamiltonian operator in the MLWF basis. 

It is worth recording here a brief historical timeline of wannier90: 
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• 1997: Two of the authors (NM and DV) develop and implement localisation 
algorithms in Fortran77; interface to the all-bands ensemble-DFT castep 
code [3]; 

• 1998: Algorithms implemented in the cpmd code [4] by Silvestrelli [5]; 

• 2001: Three of the authors (IS, NM and DV) develop the SMV disentan- 
glement extension; interface to a full-potential-linearized-augmented-plane- 
wave (FLAPW) code [6] and to any generic electronic-structure code quickly 
followed; 

• 2006: Two of the authors (AAM and JRY) completely rewrite the code 
in Fortran90, employing modern programming techniques and adding new 
functionality. 

One significant benefit of the 2006 rewrite has been that in most cases wannier90 
is faster than the original Fortran77 version by an order of magnitude, and in 
some by more than two. Another is that now the source code is easy to fol- 
low, which makes interfacing it to electronic structure codes more straightfor- 
ward. Indeed, the first such example is already in place within the quantum- 
ESPRESSO package [7]. 

The formalism has seen many and diverse applications: linear-scaling quantum 
Monte Carlo [8], photonic crystals [9,10], metal-insulator interfaces [11], as an 
efficient interpolator for the anomalous Hall effect [12] and electron-phonon 
couplings [13], and a powerful tool for the study of large-scale systems [14,15], 
to cite only a few. In addition, MLWF are playing an increasing role in bridging 
density-functional approaches and strongly-correlated ones, to derive model 
Hamiltonians or as a starting point for LDA+U or LDA+DMFT [16,17,18]. 
They are also closely related to muffin-tin orbitals [19,20]. 

The remainder of this paper is organised as follows. In Section 2 we review 
briefly the theoretical background to the problem of obtaining MLWF and a 
recent theoretical development in using them to interpolate band structures 
on arbitrarily fine k-point meshes. In Section 3 we give an overview of some 
of the more technical aspects of the wannier90 code. In Section 4 we describe 
the overall structure of the code. We demonstrate how to install and run 
wannier90 in Sections 5 and 6, respectively. Finally, in Section 7 we provide 
a number of example calculations and applications of the code. 



2 Theoretical Background 

For an isolated set of N Bloch bands ^ n k(r), a set of N Wannier functions 
w n n(r) = w n (r — R), n e [l,iV], labelled by Bravais lattice vectors R, may 
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be constructed as 
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where is a unitary matrix that mixes the bands at wave- vector k, and the 
Brillouin zone (BZ) integral may also be interpreted as a unitary transforma- 
tion. Different choices for give rise to different Wannier functions, with 
different spatial spreads, that always describe the same manifold. The MV 
strategy consists of choosing the U^ k ^ that minimise the sum of the quadratic 
spreads of the Wannier functions about their centres: 
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where (6) n = O n = (w n0 \6\w n0 ). 

It is worth noting that the spread functional may be decomposed into two 
contributions 



Q = Hi + ^, 



(3) 



where 
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(w n0 \r 2 \w n0 ) - E \(w m R,\r\w n0 ) 
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and 



^ = E E \(wmn\r\w n0 ) 

n mR^nO 



(5) 



It can be shown that each of these quantities is non-negative and that VL\ is 
gauge invariant, i.e., insensitive to changes in the matrices [1]. Therefore, 
the minimisation of the spread functional for an isolated set of bands just 
corresponds to minimising Q. 
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2.1 Reciprocal- Space Formulation 



As shown by Blount [21], matrix elements of the position operator between 
Wannier functions may be expressed in reciprocal-space as 

V r _ 

(w nR \r\w m0 ) = J e lk ' R (w nk |Vk|w m k) dk (6) 



and 



(w nB ,\r 2 \w m0 ) 



V 



3 ik R (Mnk|V^|« mk ) dk, 



(7) 



where, as usual, the periodic part of the Bloch function is defined as w„ k (r) = 
e -lk ' r -?/v tk (r). These expressions enable us to express the spread functional in 
terms of matrix elements of Vk and V 2 .. 

It is assumed that the Brillouin zone is discretised on a uniform Monkhorst- 
Pack mesh [22] on which the Bloch orbitals are determined. Thus, the gradient 
and Laplacian operators may be approximated by finite-difference formulas on 
the k-space mesh. If f(k) is a smooth function of k, then 

V k /(k) = 5>ab [/(k + b) - /(k)] (8) 

b 



and 

(/(k)|V 2 |/(k)> = |V k /(k)| 2 = $> 6 [/(k + b) - /(k)] 2 , (9) 

b 

where {b} are vectors connecting mesh-point k to its nearest neighbours and 
Ub is a weight factor associated with each shell of neighbours b = |b|. The 
choice of b- vectors and weights uJb is discussed in further detail in Section 3.2. 

It is worth noting that a finite k-point mesh implies an approximation to both 
the self-consistent ground state that is obtained using that mesh and to the 
accuracy of the finite-difference representation of the operators V k and V 2 .. In 
principle, for a given mesh of k-points, the latter may be improved by using 
higher-order finite-difference expressions. 

Having discretised Eqns. 6 and 7 in reciprocal space, the only information 
required for computing the spread functional is the overlap matrix 

= (w mk K )k+b >. (io) 
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After some algebra, the two components of the spread functional may be 
expressed as 
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(12) 



where vVk p is the number of k-points in the Monkhorst-Pack grid and r n is the 
centre of the n th Wannier function, given by 

^Hblmln^' (13) 



Using these expressions, the gradient of the spread functional with respect to 
infinitesimal unitary rotations of the ij) n k(r) may be calculated as a function 
of It is then straightforward to evolve U^) t (and consequently M^^) 

towards the solution of maximum localisation using a steepest-descents or 
conjugate- gradient minimisation algorithm. 

It is worth noting that Q may be further decomposed into band-diagonal and 
band-off-diagonal parts 

Q = Q d + Q d, (14) 



where 



^d = E E \( w nMw n0 )\' 2 (15) 
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2.2 The Case of Entangled Bands 



The above description is sufficient to obtain MLWF for an isolated set of 
bands, such as the valence states in an insulator. In order to obtain MLWF for 
entangled energy bands, e.g., for metallic systems or the conduction bands of 
an insulator, we use the "disentanglement" procedure introduced by SMV [2]. 

The strategy proceeds as follows. An energy window (the "outer window") 
is defined by the user such that at each k-point there are N^l > N states 
within the window. An orthonormal set of N Bloch states is obtained at each 
k-point, defining an iV-dimensional subspace <S(k), by performing a unitary 
transformation amongst the states that fall within the energy window: 

K£>= E « k) Kk>, (19) 



where U dis(k) is a rectangular N^l x N matrixQ 

Recall that Qj is invariant under gauge transformations within a given sub- 
space. Thus Q\ may be considered as a functional of <S(k). The subspace se- 
lection proceeds by minimising fij with respect to the matrices U dls ^ k ^ [2]. For 
a physical interpretation of this procedure, Eqn. 11 for Qj may be rewritten 

as 



N] 



2jo;f,tr PkQk+b 



k p k,b 



(20) 



where Pk = J2n=i \ u nk)(u n ]i\ is the projector onto <S(k), and Qk = 1 — Pk. 
Now it can be seen that Qi is a measure of the degree of mismatch between 
the subspaces S(k) and <S(k + b). Minimising Qi corresponds to choosing 
self-consistently at every k the subspace that has maximum overlap as k is 
varied. 

Once the projected iV-dimensional subspace at each k-point has been found, 
the MV localisation procedure described above may be used to minimise Q 
within that subspace and hence obtain MLWF. As an alternative to this two 
step minimisation, a procedure to minimise the total spread function Q has 
been proposed [23] and was found to give very similar MLWF to the present 
scheme. 



1 As U dls ( k ) is rectangular, this is a unitary operation in the sense that 
(U dis ( k ))tTJ dis(k) = I. 
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It may be the case that the energy bands of the projected subspace do not 
correspond to any of the original energy bands due to mixing between states. 
In order to preserve exactly the properties of a system within a given energy 
range (e.g., around the Fermi level) a second inner energy window is intro- 
duced. States lying within this inner, or frozen, energy window are included 
unchanged in the projected subspace. The reader is referred to Ref. [2] for 
further details of the algorithm. 

It is worth noting that the MLWF themselves are never explicitly constructed 
unless required for visualization or other post-processing purposes. Minimisa- 
tion of the spread functional results in finding the converg ed U dis ( k ) (if disen- 
tanglement was used) and U^ k ^. Along with M^ k ' b \ this is sufficient to define 
the centres and spreads of the MLWF. If the periodic parts of the Bloch wave- 
functions are available, then the MLWF may be calculated on a real-space 
grid. For systems with time-reversal symmetry, we always find the MLWF 
corresponding to the minimum spread to be real, apart from a global complex 
phase factor, in empirical agreement with a recent claim of a mathematical 
proof [24] . 

2.3 Wannier Interpolation 

Having found U dls ^ k - ) and for the target system, it is straightforward to 
express the Hamiltonian in the basis of MLWF. We first obtain the Hamilto- 
nian in the basis of rotated Bloch states 

H (w \k) = (U^(U di < k) YH{k)U di < k) UM (21) 

where H nm (k) = e n \J>nm- Next we find its Fourier sum 

H nm (R) = -^£e- ik ' R tfW(k). (22) 
iv k 



This operation is done once and for all for each of the N lattice vectors R lying 
in a supercell conjugate to the k-mesh (in practice we choose a Wigner-Seitz 
supercell centred on the origin [25]). H nm (TV) can be recognised as the matrix 
elements of the Hamiltonian between MLWF. Due to the strong localisation 
of the MLWF, the matrix elements H nm (TV) decay rapidly with R. In the 
spirit of a Slater-Koster interpolation scheme [26] this allows us to compute 
the Hamiltonian on a much finer mesh of k-points in the original Bloch space. 
We can perform the inverse Fourier transform 

/Wk') = E eik '' R #™n( R )> ( 23 ) 

R 
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to yield the interpolation of Eqn. 21 onto an arbitrary k-point k'. The final 
step is to diagonalise H nm (k!) to obtain the interpolated band energies. 

By construction, the interpolated band energies coincide with the true band- 
structure at the original k-points of the Monkhorst-Pack mesh (if the SMV 
scheme is used, this is only guaranteed to occur within the inner energy win- 
dow). At intermediate k-points, the accuracy of the interpolation is dependent 
on the density of the original mesh [14,25]. 

Within wannier90 this interpolation procedure can be used to obtain plots 
of band structure, density of states and the Fermi surface at modest compu- 
tational cost. The Wannier interpolation formalism is rather general and can 
also be used to interpolate arbitrary periodic operators [25]. 



3 Some Technical Aspects 

3. 1 Initial guess for iterative minimisation 

The iterative minimisation of Q\ begins with an initial guess for the subspaces 
«S(k). One possible choice is to start from the initial, random phases of the 
Bloch states provided by the ab initio code. Alternatively, we may define a 
set of N trial functions g n { r ), n £ [1>A], as an initial approximation to the 
N MLWF. These are projected onto the cell-periodic parts of the N^ n Bloch 
eigenstates inside the energy window: 

7V< k > 

win 

|0 nk ) = ]T AW\u mk ), (24) 

m=l 

where A$ n = (u m k|<7n) is an x N matrix. Orthonormalising the resulting 
N functions {|0„k)} via a Lowdin transformation, we find 

N JV<M 

K P k ) = E (S- 1/2 Ul0 mk > = J2(AS- 1/2 ) mn \u mk ), (25) 

m=l m=l 

where S mn = S£l = (0 m k|0rik) = (A^A) mn , and AS^ 1 ^ 2 is used as the initial 
guess for U dls ( k ) . 

The same trial orbitals {|<? n )} can also be used to initialize the minimization of 
Cl. Using a similar projection procedure to the one described above, an initial 
guess for the N x N unitary matrices TJ( k ) is obtained at each k-point. 
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We now come to the choice of trial orbitals. As the minimisation scheme is 
quite robust, one option is to choose a set of spherically-symmetric Gaussian 
functions whose centres are chosen randomly; wannier90 supports this option. 
On the other hand, a user may wish to apply chemical and physical insight in 
order to select a better starting point. An ab initio code that is interfaced to 
wannier90 may use any localised functions desired to construct the projections 
A$ n . For convenience, we have defined a standard set of projection functions 
that should suffice for most situations. This is the set of atomic orbitals of the 
hydrogen atom, which is a convenient choice for two reasons: first, analytical 
mathematical forms exist in terms of the good quantum numbers n, I and m; 
and second, hybrid orbitals (sp, sp 2 , sp 3 , sp 3 d etc.) may be constructed by 
simple linear combination \<p) = J2nim C n im\nlm) , for some coefficients C n \ m . 
The angular parts of our projection functions are not the canonical spherical 
harmonics Y lm of the hydrogenic Schrodinger equation, but rather the real (in 
the sense of non-imaginary) states 0; mr , m T G [1,2/ + 1], obtained by unitary 
transformation of the Y\ m . 

Each localised function is associated with a site and an angular momentum 
state. Optionally, one may define the spatial orientation, the radial form and 
the diffusivity for each function. wannier90 is able to project onto functions 
with s, p, d and f symmetry, plus the hybrids sp, sp 2 , sp 3 , sp 3 d, sp 3 d 2 . The 
user is referred to the documentation in the wannier90 distribution for math- 
ematical definitions and details on how to specify projection functions in the 
input file. 

In general, the spread functional Q has local minima and, occasionally, the 
minimisation becomes trapped in one. In other words, the final solution may 
depend on the initial choice for «S(k) and hence {g n (r)}. In most cases, we 
find that these local minima give rise to MLWF that are complex, i.e., they 
have significant imaginary parts. In some cases, MLWF associated with local 
minima of Q are found to be real and the reader is referred to Ref. [27] for 
more details. 



3.2 Choosing the b vectors 

As discussed in Section 2.1, in order to compute the spread functional we 
require a finite-difference formula for Vk- We now describe an automatic pro- 
cedure to choose, for cells of arbitrary symmetry, the simplest such formula. 

It is assumed that the Brillouin zone is discretised on a uniform Monkhorst- 
Pack mesh [22]. The vectors {b} connect each mesh-point k to its nearest 
neighbours. iV B h shells of neighbours are included in the finite-difference for- 
mula, with M s vectors in the s th shell. For Vk to be correct to linear order we 
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require that (see Eqn. Bl of Ref. [1]) 

E«».EW = ^ (26) 

s i 



where h hS , j e [1, M s ], is the i th vector belonging to the s th shell with associated 
weight w s , and a and (3 run over the three Cartesian indices. If /(k) is a smooth 
function of k, then its gradient may be expressed according to Eqn. 8. For the 
case of a linear function /(k) = /o+g-k, this finite-difference formula gives the 
exact result Vk,» = Y,b w bb a 9f3bf3 = b a fi9fi = 9a, where summation convention 
over repeated Greek indices is assumed. 

In order to find the weights w s , we notice that Eqn. 26 is symmetric in the 
Cartesian indices, therefore, there are six independent elements that may be 
expressed through a matrix equation 

Aw = q, (27) 



where q is a vector of length six containing the six elements of the lower 
triangular part of 8 a p, w is the vector of weights with length N^, and A 
is a 6 x iV s h matrix given by Aj jS = J2iba S b l ^ s , where there is a one-to-one 
correspondence between j e [1,6] and the six independent pairings of a and 
p. A may be factorised using a singular value decomposition [28], 

A = UDV T , (28) 

which permits inversion of Eqn. 27 and solution for the shell weights, 

w = VD^lfq. (29) 



Our automatic procedure for choosing the shells is as follows: we add shells in 
order of increasing |b|; with each additional shell we obtain the shell weights, 
and check to see if Eqn. 26 is satisfied; if not, then we add a new shell and 
repeat the procedure. A new shell may be linearly dependent on the existing 
shells, in which case one or more of the singular values will be close to zero 
and we reject the new shell. When the Bravais lattice point group is cubic, the 
first shell of nearest neighbours is sufficient. The maximum number of shells 
required is six (for triclinic symmetry). For the case of a very elongated unit 
cell it may be necessary for the routine to search through a large number of 
shells in order to find the correct combination. 

With the above finite-difference formalism, the quadratic spread converges 
only polynomially with the sampling of the Brillouin zone. This slow con- 
vergence is a property of the spread operator, whereas the underlying MLWF 



13 



converge rapidly with the k-point density [29] . It may be possible to achieve im- 
proved accuracy and k-point convergence by using higher-order finite-difference 
formulas for Vk, but this has not been explored. 

In some cases, the automatic procedure outlined above might not find the 
shells and weights that give the most spherically symmetric representation of 
the position operator. When this happens, special care may be required to 
ensure that the most symmetric choice of shells is made [6], and this can be 
done explicitly (i.e., by hand) in the input file. 

3.3 Specific Algorithms for Y -Point Sampling 

For isolated molecules or extended systems of large unit cell, single T-point 
sampling in the reciprocal space can provide an accurate description of the 
physical quantities of interest. When the T-point is sampled exclusively, the 
computational cost can be reduced by exploiting the symmetries of the overlap 
matrices M( r ' b ). 

First, at T, the b- vectors are linear combinations of the primitive vectors of 
the reciprocal lattice, thus imposing the following conditions: 

Vyr+b(r) = V^r(r) , w„,r+b(r) = e- lhr u nr (r). (30) 
Then, it follows that M( r,_b ^ is the Hermitian conjugate of M( r ' b ), 

M mn h) = (Umr\u n ,r+b) = (u n ,T+b\Umr)* = (u n T \ U m ,T-b) * 

= (.u '-- »■)" . 

y- y± nm J 

Second, at T, the Bloch eigenfunctions may be chosen to be real. Then, the 
overlap matrices become symmetric, M^ b ) = M ( ^°\ and only the upper or 
the lower half of M^ r ' b ^ is independent. 

wannier90 includes a T-only branch of algorithms that is able to exploit these 
symmetries. These algorithms may be activated by means of a logical keyword 
in the input file and they rely on the above relations being satisfied, i.e., the 
Bloch wavefunctions must be real. 

There are several other advantages beside the significant decrease in the num- 
ber of operations. In the disentanglement procedure [2], the diagonalisation of 
a complex Hermitian matrix at each iteration step is replaced by that of a real 
symmetric matrix, which ensures that U dls( - r ) is real. The localisation proce- 
dure in the T-only branch adopts an efficient algorithm proposed by Gygi et 
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al. [30]. This method minimises f2oD (not Q) by simultaneously diagonalizing 
a set of real symmetric matrices, {ReM' r ' b ',ImM' r,b ^}. The from this 
method is inherently real, giving real MLWF as a final product. 

It is worth mentioning that these MLWF are not necessarily identical to those 
obtained from the MV localisation procedure unless f2o is strictly zero due to 
symmetry of the system. 



3.4 Representative Timings and Convergence 



In this section, timings and convergence characteristics for representative cal- 
culations are presented. The initial band-structure calculations are performed 
using the PWSCF code [7]. The self-consistent ground state is obtained, then 
the required overlap matrices and projections are calculated using the post- 
processing routine pw2wannier90, supplied with the PWSCF distribution. wannier90 
is then used to obtain the MLWF. The PWSCF and pw2wannier90 calculations 
were performed on a four-processor (dual dual-core) Intel Woodcrest 5130 
2.0GHz desktop computer and the wannier90 calculations on a single proces- 
sor of the same machine. The gauge-invariant and gauge-dependent spreads 
were converged to a tolerance of 10~ 10 A 2 . 



3.4-1 Crystalline Silicon 

First, we consider a two-atom unit cell of crystalline silicon. A kinetic-energy 
cutoff of 25 Ry is used for the plane- wave expansion of the valence wavefunc- 
tions. The core-valence interaction is described by means of norm- conserving 
pseudopotentials in separable Kleinman-Bylander form [31]. Table 1 gives tim- 
ings for obtaining the valence MLWF for various densities of Monkhorst-Pack 
k-point grid. Bond-centred s-type functions were used for the initial projec- 
tion. It can be seen that the most time consuming part of the procedure is 
the computation of the band structure, followed by construction of M^ b ) 
and A$ n , with wannier90 taking only 5-7% of the total time to perform the 
localisation of the MLWF and to write files for their visualisation. 

In Figure 1 the effect of using different initial projections for the MLWF is 
demonstrated. It is worth noting that even when choosing randomly-centred 
s-type functions the correct MLWF are found in less than 10% of the total 
time. 
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iVkp 


2scf 


Tm,a 


Tw90 


Niter 


64 


7 


5 


< 1 


5 


512 


52 


44 


4 


14 


1728 


190 


142 


13 


30 


4096 


462 


329 


56 


96 



Table 1 

Timings (in seconds) for the valence states of crystalline silicon with (i) different 
densities of Monkhorst-Pack mesh. Tscf is the time taken by PWSCF to obtain the 
ground state wavefunctions at all A^ p k-points of the Monkhorst-Pack mesh, Tm,a 

fk b) (k) 

is the time taken by pw2wannier90 to calculate Mmn and vlmn, and Twgo is the 
time taken by wannier90 to localise and plot the MLWF. N^er is the number of 
iterations required to minimise the total spread. 




Iteration number 

Fig. 1. Convergence of total spread 0, for crystalline silicon with an 8 x 8 x 8 
Monkhorst-Pack k-point grid. Solid curve: bond-centred s projection. Dashed curve: 
atom-centred sp 2 -like projection. Dotted curve: atom-centred s and p projection. 
Dot-dashed curve: randomly centred spherical s-type projection, f^o is the converged 
total spread and is independent of the initial projection used. 

3.4.2 Fullerene 

We consider an isolated fullerene molecule (Ceo) as a test case for the T-only 
algorithms. The molecule was placed in a cubic supercell of side-length 40 ao 
(21.2 A), and the calculations were performed with T-point sampling, ultrasoft 
pseudopotentials [32], and a plane- wave cut-off of 30 Ry. 

Fullerene has 120 valence states in total. For the localisation of the valence 
MLWF, two choices of initial projections are compared, one being 120 randomly- 
centred s-orbitals and the other being 120 s-orbitals placed on the MLWF 
centres generated by the former run. The disentanglement procedure is tested 
on the extended 150 states comprising 120 valence states and 30 states that 
cover the full 7r*-manifoid. Those 30 states are disentangled from the 100 un- 
occupied eigenf unctions spanning up to 7.5 eV above the HOMO level. The 
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final set of MLWF are found to consist of 60 atom-centred p z orbitals and 90 
bond-centred cr-bonding orbitals, similar to the MLWF shown in Fig. 4. 



Projection 


r 


N 


Tm,a 


^W90 


^dis 




^loc 


/\r. loc 

iter 


randomly-centred s 


F 
T 


120 
120 


901 

686 


33 
1 


n/a 
n/a 


n/a 
n/a 


33 
1 


226 
16 


bond-centred s 


F 


120 


893 


18 


n/a 


n/a 


18 


117 




T 


120 


694 


< 1 


n/a 


n/a 


< 1 


11 


bond-centred s, 


F 


150 


1595 


46 


22 


75 


24 


83 


atom-centred p z 


T 


150 


1017 


12 


11 


75 


1 


9 



Table 2 

Timings (in seconds) for MLWF in fullerene from general k-point and r-specific 
schemes. Different initial projections are tested. "T" in the second column (L) in- 
dicates that the T-only branch of algorithms is used. N is the number of MLWF 
obtained. Tm,a is the time taken to calculate and Ami, and Twgo is the total 

time taken by wannier90 to disentangle and localise MLWF. The individual time 
and the number of iterations for each of these operations are given as T opcr and 
N°^ r (where oper=dis or loc), respectively. 

Timings for these calculations are summarized in Table 2. Tm,a decreases by 
more than 20% when the T-only branch is used, because only half the b- vectors 
are needed. The post-processing routine pw2wannier90 is not yet optimized 
to take full advantage of real wavef unctions, and we expect at least 50% of 
time reduction in the fully optimized case. 

The minimisation method used in the T-only branch is apparently more effi- 
cient than the conjugate- gradient algorithm, especially when the initial pro- 
jections are far from the final, converged MLWF. vanishes in the case of 
a simple cubic lattice, and Q from the two different minimization methods 
converges to an identical value up to the given tolerance. Tdis decreases by 
50% for the reason discussed in Section 3.3, but Nfi.™ is the same, as it should 
be, and fl\ is identical within machine precision. 



4 Structure of the Program 

The schematic structure of the program is outlined in Fig. 2. Each box repre- 
sents a Fortran90 module and the lines represent module dependencies. Mod- 
ules only use data and subroutines from lower modules in the diagram. A 
description of the purpose of each module is given below. 

• constants: definition of constants (e.g., 7r) 
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• io: error handling, timing, and input and output units 

• utility: commonly used operations such as conversion of Cartesian to frac- 
tional co-ordinates, string functions, matrix multiplication wrapper etc. 

• parameters: all physical parameters relevant to the calculation and sub- 
routines for reading them from the input file at the start of a calculation. 
Subroutines for writing checkpoint files for restarting previous runs. 

• kmesh: set up of the framework for reciprocal-space derivatives and, if in 
post-processing mode, writing a file which communicates to an ab initio 
code information on how to calculate M^j^ and 

• overlap: reading of the overlap matrices and 

• disentangle: if using entangled energy bands, finding the optimal subspace 
within a specified energy window by minimising the gauge-invariant spread 

• wannierise: finding the unitary transformations amongst the energy 
bands which give rise to MLWF 

• plot: routines for output of Wannier functions, Fermi surface and band 
structure in file formats suitable for visualisation 

• wannier_prog: the main program 

• wannier_lib: library routines. wannier90 may be invoked directly from 
within an ab initio code as a set of library calls. The reader is referred to Sec- 
tion 6.1 for a brief description and to the documentation in the wannier90 
distribution for full details. 



5 Installation 

wannier90 is distributed as a gzipped tar file (|http:/ /www.wannier.org/). On 
Linux platforms, for example, it may be unpacked by typing 

> tar zxvf wannier90 . tar . gz 

which creates a directory containing the source files, documentation, examples 
etc. 

Compilation is straightforward. From the conf ig directory of the distribution 
the user should select the make . sys .plat file which corresponds most closely 
to the platform being used and copy it to the root directory of the distribu- 
tion, renaming it make . sys in the process. The values of system-dependent 
parameters (e.g., the location of the BLAS and LAPACK libraries, the name 
of the Fortran compiler, the Fortran optimisation flags etc.) that are defined 
therein should be modified to correspond to the user's particular system. Once 
this has been done, typing make from the root directory of the distribution 
will create an executable wannier90 .x. Further details and make options may 
be found in the file INSTALL . readme in the root directory of the distribution. 
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constants 

Fig. 2. Schematic structure of the program 
6 Running wannier90 



Before running wannier90 the user must perform a self-consistent first-principles 
calculation on the system of interest in order to obtain a set of Bloch energy 
bands from which MLWF may then be constructed. Once the Bloch bands 
have been computed wannier90 may be operated in a post-processing mode 
as described below. 

The master input file for wannier90 is called seedname . win, where seedname 
is the prefix of all of the input and output files. The input system is designed 
to be simple and user-friendly and is described comprehensively in the docu- 
mentation that is contained within the wannier90 distribution. An example 
input file is provided in Appendix A. 

wannier90 must be run twice. On the first pass the command line option -pp 
must be used, as follows: 

> wannier90.x -pp seedname 

This causes the code to read the master input file seedname .win and generate 
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the file seedname .nnkp that contains all the information necessary to con- 
struct the overlap matrices M^ b ) and A$ n from the Bloch bands already ob- 
tained from a first-principles calculation. In order to interface an ab initio code 
to wannier90 one needs to write subroutines that read seedname .nnkp, com- 
pute the overlap matrix and, optionally, A^from the Bloch bands and 
write these matrices in the appropriate format to files called seedname .mmn 
and seedname . amn, respectively. If using the disentanglement procedure or 
plotting a band structure, density of states or Fermi surface, wannier90 also 
requires the eigenvalues e n k corresponding to the Bloch states -0 n k(r), which 
should be written to a file called seedname . eig. The reader is referred to 
the documentation that is contained within the wannier90 distribution for 
complete details of these files. 

Once the necessary files have been written, wannier90 must be run again, this 
time without any command-line options: 

> wannier90.x seedname 

On this pass, the code reads the overlap matrices and eigenvalues (if required) 
from file and performs the maximal-localisation procedure as outlined in Sec- 
tion 2, writing the output to a file seedname .wout. 

At the time of writing, the PWSCF code (a part of the quantum-espresso 
package [7]), which is available under GNU General Public License, has a 
wannier90 interface in the form of a post-processing program called pw2wannier90. 
It is the authors' hope that wannier90 is a sufficiently useful tool for investi- 
gators to be motivated to write interfaces to other electronic structure codes. 

6.1 Library mode 

wannier90 may also be compiled as a library and invoked with subroutine 
calls from within an ab initio code. The command 

> make lib 

creates a library libwannier.a in the root directory of the distribution. The 
library mode of wannier90 works along exactly the same lines as the post- 
processing mode, described above. The formal difference is that in the latter, 
information is passed between the ab initio code and wannier90 via files, such 
as seedname .nnkp and seedname. mmn etc., whereas, in the former, it is done 
via direct calls to library subroutines. As before, there are two stages: first, a 
call to the library subroutine wannier_setup, which returns the information 
necessary to construct the overlap matrices M^ b ) and A$ n from the Bloch 
bands; second, a call to the subroutine wannier_run, which takes as input the 
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M^ b ) matrix (and A^ n and the eigenvalues e n k, if required) and performs the 
maximal-localisation procedure as outlined in Section 2. The reader is referred 
to the documentation in the wannier90 distribution for further details. 



7 Examples 

7. 1 Graphite 

As an example we consider the generation of MLWF to describe the states 
around and below the Fermi level in Bernal (A-B-A) graphite. From the band- 
structure (Fig. 3) we expect that the minimum number of Wannier functions 
needed to describe these states is 10 per unit cell (2.5 per atom). We shall see 
that this choice corresponds to an intuitive chemical description of the system. 

We perform the initial band-structure calculations using the pwscf code [7]. 
A kinetic-energy cutoff of 30 Ry is used for the plane- wave expansion of the 
valence wavefunctions. The core-valence interaction is described by means of 
norm-conserving pseudopotentials in separable Kleinman-Bylander form [31]. 
We obtain the self-consistent ground state using a 16x16x16 Monkhorst- 
Pack mesh of k-points [22] and a fictitious Fermi smearing [3] of 0.02 Ry for 
the Brillouin-zone integration. Then, we freeze the self-consistent potential 
and perform a non-self-consistent calculation on a uniform 6x6x6 grid of 
k-points. At each k-point we calculate the first 20 bands. The required over- 
lap matrices and projections are calculated using the post-processing rou- 
tine pw2wannier90, supplied with the PWSCF distribution. Projections onto 
atom centred sp 2 and p z functions are used to construct the initial guess, and 
wannier90 is used to obtain the MLWF. The gauge-dependent and gauge- 
independent spreads converge to machine precision in 300 and 70 steps, respec- 
tively. The resulting MLWF are a set of six symmetry-related bond-centred 
cr-orbitals and four atom-centred p 2 -orbitals, shown in Fig. 4 (as plotted with 
the XCrySDen package [33]). Similar MLWF were obtained for carbon nan- 
otubes [14]. In Fig. 3 we show the band structure of graphite obtained using 
Wannier interpolation and compare it to the band structure obtained from a 
full first-principles calculation. Within the inner energy window the agreement 
is essentially perfect. 

7.2 Lead 

Let us consider the generation of MLWF to describe the states around and 
below the Fermi level in lead. As relativistic effects can be significant in heavy 
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r m k r a 

Fig. 3. Band structure of graphite. Solid lines: original band structure from a con- 
ventional first-principles calculation. Dotted lines: Wannier-interpolated band struc- 
ture. The zero of energy is the Fermi level. 




Fig. 4. Isosurface contours of MLWF in graphite (red for positive value and blue for 
negative), [left] a-type MLWF. [right] p^-type MLWF. 

elements, we include spin-orbit coupling in our electronic structure calculation. 
In lead, the 6s and 6p states form an isolated set of bands around the Fermi 
level. The ground-state structure of lead has both time-reversal and inversion 
symmetries, so that each band is two-fold degenerate. We will obtain a set of 
eight MLWF to describe these states. 

Once again we perform the initial band-structure calculations using the PWSCF 
code. A kinetic-energy cutoff of 45 Ry is used for the plane-wave expansion 
of the valence wavef unctions. The core-valence interaction is described by 
means of spin-orbit coupled norm-conserving pseudopotentials [34] in sepa- 
rable Kleinman-Bylander form. We obtain the self-consistent ground state 
using a 16x16x16 Monkhorst-Pack mesh of k-points and a fictitious Fermi 
smearing [3] of 0.02 Ry for the Brillouin-zone integration. The self-consistent 
potential is frozen and we perform a non-self-consistent calculation on a uni- 
form 12x12x12 grid of k-points. Then we calculate the required overlap ma- 
trices and projections using pw2wannier90. For the initial projection we use 
orbitals with sp 3 symmetry, four projections onto spin-up states and four onto 
spin-down states. 

wannier90 is used to obtain the MLWF. The gauge-dependent spread is con- 
verged in 200 steps. In Fig. 5 we show both the band structure and Fermi sur- 
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face obtained using Wannier interpolation. Although the spin-orbit induced 
splitting is large at certain points in the band structure (e.g., 3eV at T), the 
Fermi surface is not significantly different from a scalar-relativistic calculation. 
The reader is referred to Ref. [12] for a detailed discussion of how to construct 
spinor Wannier functions for ferromagnetic systems with broken time-reversal 
symmetry. 




Fig. 5. [left] Band structure of lead. Solid lines: original band structure from a 
conventional first-principles calculation. Dotted lines: Wannier-interpolated band 
structure. The zero of energy is the Fermi level, [right] Fermi surface of lead. 



8 Conclusions 

In conclusion, we present a new code called wannier90 for computing maximally- 
localised Wannier functions. wannier90 is freely available under the GNU 
General Public Licence [35]. It is very user-friendly and is written in Fortran90, 
employing modern programming techniques that enable the addition of further 
functionality, such as transport properties or interpolation of electron-phonon 
couplings, in an easy and modular fashion, wannier 90 has been seamlessly 
interfaced to the PWSCF plane-wave DFT code [7] and, at the time of writing, 
interfaces to other codes, e.g., abinit [36], castep [37] and fleur [38], are 
in progress. We hope that the availability of wannier90 will encourage the 
wider use of maximally-localised Wannier functions in the electronic structure 
community. 
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A Sample files 



Input file: 



num_bands 

num_wann 

dis_win_max 

dis_f roz_max 

dis_num_iter 

num_iter 



20 
10 
19.2 
9.8 
400 
100 



Begin Atoms_Frac 
CI 0.0000000000 
CI 0.3333333333 
C2 0.0000000000 
C2 -0.3333333333 

End Atoms Frac 



0.0000000000 
0.6666666667 
0.0000000000 
-0.6666666667 



0.7500000000 
0.2500000000 
0.2500000000 
0.7500000000 



Begin Unit_Cell_Cart 

2.1304215583 -1.2299994602 
0.0000000000 2.4599989204 
0.0000000000 0.0000000000 

End Unit_Cell_Cart 



0.0000000000 
0.0000000000 
6.8000000000 



Begin Projections 

CI : sp2;pz 

C2:pz 
End Projections 

bands_plot = true 

begin kpoint_path 

G 0.0000 0.0000 0.0000 M 0.5000 -0.5000 0.0000 

M 0.5000 -0.5000 0.0000 K 0.6667 -0.3333 0.0000 

K 0.6667 -0.3333 0.0000 G 0.0000 0.0000 0.0000 

G 0.0000 0.0000 0.0000 A 0.0000 0.0000 0.5000 

end kpoint_path 



mp_grid 



= 666 
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begin kpoints 
0.00000000 
0.00000000 
0.00000000 
0.00000000 
0.00000000 
0.00000000 
0.00000000 



0.00000000 
0.00000000 
0.00000000 
0.00000000 
0.00000000 
0.00000000 
0.16666667 



0.00000000 
0.16666667 
0.33333333 
0.50000000 
0.66666667 
0.83333333 
0.00000000 



0.83333333 0.83333333 0.66666667 
0.83333333 0.83333333 0.83333333 
End Kpoints 



Output file (truncated): 



Final State 



WF 


centre 


and spread 


1 


( 


-0. 


.354954, 


0. 


.614798, 


5. 


.100000 


) 


0, 


,59341185 


WF 


centre 


and spread 


2 


( 


-0. 


.354954, 


-0 


.614798, 


5. 


.100000 


) 


0. 


.59341185 


WF 


centre 


and spread 


3 


( 


0. 


.709907, 


0, 


.000000, 


5, 


.100000 


) 


0. 


.59341190 


WF 


centre 


and spread 


4 


( 


0. 


.000000, 


0, 


.000000, 


5, 


.100000 


) 


1. 


.04236015 


WF 


centre 


and spread 


5 


( 





.354954, 


1 


.845201, 


1. 


.700000 


) 


0. 


.59341120 


WF 


centre 


and spread 


6 


( 





.354954, 





.614798, 


1. 


.700000 


) 


0. 


.59341085 


WF 


centre 


and spread 


7 


( 


1 


.420514, 


1. 


.230000, 


1, 


.700000 


) 





.59341100 


WF 


centre 


and spread 


8 


( 


0. 


.710140, 


1. 


.229999, 


1, 


.700000 


) 


1 


.04371374 


WF 


centre 


and spread 


9 


( 


0, 


.000000, 


0. 


.000000, 


1, 


.700000 


) 


1. 


.04233128 


WF 


centre 


and spread 


10 


( 


-0. 


.710140, 


-1 


.229999, 


5. 


.100000 


) 


1. 


.04368702 


Sum of centres and spreads 


( 


2. 


.130421, 


3, 


.689998, 


33 


.999999 


) 


7, 


.73256084 



Spreads (Ang~2) Omega I = 6.121019675 

================ Omega D = 0.032181361 

Omega 0D = 1.579353494 

Final Spread (Ang~2) Omega Total = 7.732554530 
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